Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add nrow and ncol #285

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
Open

Conversation

rikhuijzer
Copy link

@rikhuijzer rikhuijzer commented Jun 29, 2022

I noticed that there is a rowcount, but not a columncount and that length(columnnames(cols)) was used in the codebase to count the number of columns (because that is the fastest?). This PR suggests to add a columncount function.

@bkamins
Copy link
Member

bkamins commented Jun 29, 2022

The DataAPI.jl defines nrow and ncol as a public API for this. rowcount is internal. If we are at it it would be probably better to properly implement default nrow and ncol implementations in this PR.

@rikhuijzer
Copy link
Author

rikhuijzer commented Jun 29, 2022

If we are at it it would be probably better to properly implement default nrow and ncol implementations in this PR.

That would be great yes. Then Tables.jl sources can provide more efficient methods.

rowcount is internal.

How can I make nrow and ncol public for Tables.jl? I'm not sure what the convention is here

Copy link
Member

@bkamins bkamins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I see that what I wanted is a minimal type piracy, but I think it is OK.

@bkamins
Copy link
Member

bkamins commented Jun 29, 2022

How can I make nrow and ncol public for Tables.jl? I'm not sure what the convention is here

You do not need to do anything more than you did. Tables.jl does not export anything, but nrow and ncol are public because they implement public interface defined in DataAPI.jl.

@codecov
Copy link

codecov bot commented Jun 29, 2022

Codecov Report

Merging #285 (f094263) into main (60c080c) will decrease coverage by 0.55%.
The diff coverage is 50.00%.

❗ Current head f094263 differs from pull request most recent head acc8ac8. Consider uploading reports for the commit acc8ac8 to get more accurate results

@@            Coverage Diff             @@
##             main     #285      +/-   ##
==========================================
- Coverage   94.88%   94.32%   -0.56%     
==========================================
  Files           7        7              
  Lines         665      670       +5     
==========================================
+ Hits          631      632       +1     
- Misses         34       38       +4     
Impacted Files Coverage Δ
src/dicts.jl 92.66% <0.00%> (-1.74%) ⬇️
src/fallbacks.jl 97.90% <0.00%> (-0.69%) ⬇️
src/namedtuples.jl 98.66% <0.00%> (-1.34%) ⬇️
src/Tables.jl 88.00% <100.00%> (ø)
src/matrix.jl 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 60c080c...acc8ac8. Read the comment docs.

src/fallbacks.jl Outdated Show resolved Hide resolved
src/fallbacks.jl Outdated
# get the number of rows in the incoming table
function rowcount(cols)
"Return the number of rows in the incoming table."
function nrow(cols)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bkamins, does this not pirate a default definition? I'm a little uncomfortable with this definition because technically rowcount was only meant to be called on the result of Tables.columns, but now it "looks like" it works generically on any table. And for the most part, it probably will, but we're not really following the principles of the interface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this not pirate a default definition?

DataAPI.jl does not define any method for this function. Same for ncol - DataAPI.jl does not define any method for this function.

As I have commented - it does pirate as it add the definition to ncol with ::Any signature. The problem is exactly what you have commented - some tables might not have a well defined row count (maybe also the same problem is for ncol). So what do you think would be best to do? What I would ideally have is some definition that works by default correctly. I think that if default errors it is not a huge problem if it does not error on "standard" tables.

Maybe what we should do is:

  • leave rowcount to be internal;
  • define nrow and ncol only for tables defined in Tables.jl tables (rowtable, coltable, dictrowtable, dictcolumntable, table maybe some more? - you know best)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's what I would prefer (only define for Tables.jl-defined tables. I'm also happy to add recommendations to the docs that table types should strongly consider implementing nrow/ncol on their own table types as apart of the API.

src/fallbacks.jl Outdated
@deprecate rowcount(x) nrow(x)

"Return the number of columns in the incoming table."
ncol(cols) = length(columnnames(cols))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here; columnnames and getcolumn are only technically allowed to be called on the result of Tables.columns or on individual iterated rows from Tables.rows, so this makes me a bit uncomfortable.

@rikhuijzer
Copy link
Author

@bkamins @quinnj Is 5722256 going in the expected direction? Probably needs some more tests and I probably should implement a few more methods?

@rikhuijzer rikhuijzer changed the title Add columncount Add nrow and ncol Jun 30, 2022
@bkamins
Copy link
Member

bkamins commented Jun 30, 2022

@quinnj will know best if you correctly covered everything :), but for matrices things look good :).

src/Tables.jl Outdated Show resolved Hide resolved
@rikhuijzer rikhuijzer requested a review from quinnj July 4, 2022 07:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants